114 research outputs found

    PIE - the protein inference engine

    Posttranslational modifications are vital to protein function but are hard to study, especially since several modification isoforms may be present simultaneously. Mass spectrometers are a great tool for investigating modified proteins, but the data they generate are often incomplete, ambiguous, and difficult to interpret. Combining data from multiple experimental techniques provides complementary information. Having both top-down (intact protein mass) and bottom-up (peptide) data is especially valuable. In the context of background knowledge, combined data is used by human experts to interpret what modifications are present and where they are located. However, this process is arduous and, for high-throughput applications, needs to be automated. To explore a data integration methodology based on Markov chain Monte Carlo and simulated annealing, I developed the PIE (Protein Inference Engine). This Java application integrates information using a modular approach that allows different types of data to be considered simultaneously and new data types to be added as needed. Validation of the PIE was carried out using two realistically imperfect theoretical data sets. The first, based on the L7/L12 ribosomal protein, tested the limits of PIE's performance as intact mass accuracy and peptide coverage decrease. The second set, based on a mix of two modification variants of the H23c Histone protein, tested PIE's ability to handle isoform mixtures and up to eight simultaneous modifications. The PIE was then applied to analysis of experimental data from an investigation of the modification state of the L7/L12 ribosomal protein. This data consisted of a set of peptides identified as associated with some L7/L12 modification variant and nine intact mass measurements identified as L7/L12 modification variants. From this data, PIE was able to make consistent predictions, comparable to expert manual interpretation.
Software, source code, user manuals, and demo projects replicating the analyses described in the following can be downloaded from http://pie.giddingslab.org/
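
The abstract above names simulated annealing as the search strategy but does not describe PIE's scoring functions or move set. The following is a minimal Python sketch of the general idea only, not PIE's actual implementation: it searches for an assignment of modifications to sites that explains an observed intact-mass shift (top-down evidence) while agreeing with site-level peptide observations (bottom-up evidence). The mass values, the penalty weight, the cooling schedule, and every name in the code are illustrative assumptions.

```python
import math
import random

# Assumed, rounded mass shifts in Da; illustrative values only.
MOD_MASS = {"methyl": 14.016, "acetyl": 42.011}
CHOICES = [None, "methyl", "acetyl"]


def score(state, observed_shift, peptide_hits, penalty=5.0):
    """Lower is better: intact-mass error plus a penalty for each
    peptide-level observation the candidate state fails to explain."""
    total = sum(MOD_MASS[mod] for mod in state.values() if mod)
    mass_error = abs(total - observed_shift)
    missed = sum(1 for site, mod in peptide_hits.items() if state.get(site) != mod)
    return mass_error + penalty * missed


def anneal(sites, observed_shift, peptide_hits,
           steps=20000, t_start=10.0, t_end=0.01, seed=0):
    """Simulated annealing with Metropolis acceptance and geometric cooling."""
    rng = random.Random(seed)
    state = {site: None for site in sites}
    current = score(state, observed_shift, peptide_hits)
    best, best_score = dict(state), current
    for i in range(steps):
        # Temperature decays geometrically from t_start towards t_end.
        t = t_start * (t_end / t_start) ** (i / steps)
        # Propose changing the modification at one randomly chosen site.
        proposal = dict(state)
        proposal[rng.choice(sites)] = rng.choice(CHOICES)
        new = score(proposal, observed_shift, peptide_hits)
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if new <= current or rng.random() < math.exp((current - new) / t):
            state, current = proposal, new
            if current < best_score:
                best, best_score = dict(state), current
    return best, best_score


# Hypothetical toy data: four candidate sites, an intact-mass shift equal to
# two methylations plus one acetylation, and one peptide that places an
# acetyl group on site 2.
sites = [1, 2, 3, 4]
observed_shift = 2 * MOD_MASS["methyl"] + MOD_MASS["acetyl"]
peptide_hits = {2: "acetyl"}
best, best_score = anneal(sites, observed_shift, peptide_hits)
print(best, best_score)
```

In this toy setting the intact mass alone cannot say which sites carry the two methyl groups, so several assignments score equally well. That is the kind of ambiguity the abstract attributes to incomplete data, and a modular score like this one could be extended with further evidence terms to narrow it down.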

    An Integrated TCGA Pan-Cancer Clinical Data Resource to Drive High-Quality Survival Outcome Analytics

    For a decade, The Cancer Genome Atlas (TCGA) program collected clinicopathologic annotation data along with multi-platform molecular profiles of more than 11,000 human tumors across 33 different cancer types. TCGA clinical data contain key features representing the democratized nature of the data collection process. To ensure proper use of this large clinical dataset associated with genomic features, we developed a standardized dataset named the TCGA Pan-Cancer Clinical Data Resource (TCGA-CDR), which includes four major clinical outcome endpoints. In addition to detailing major challenges and statistical limitations encountered during the effort of integrating the acquired clinical data, we present a summary that includes endpoint usage recommendations for each cancer type. These TCGA-CDR findings appear to be consistent with cancer genomics studies independent of the TCGA effort and provide opportunities for investigating cancer biology using clinical correlates at an unprecedented scale. Analysis of clinicopathologic annotations for over 11,000 cancer patients in the TCGA program leads to the generation of TCGA Clinical Data Resource, which provides recommendations of clinical outcome endpoint usage for 33 cancer types

    Genomic, Pathway Network, and Immunologic Features Distinguishing Squamous Carcinomas

    This integrated, multiplatform PanCancer Atlas study co-mapped and identified distinguishing molecular features of squamous cell carcinomas (SCCs) from five sites associated with smokin

    Pan-Cancer Analysis of lncRNA Regulation Supports Their Targeting of Cancer Genes in Each Tumor Context

    Long noncoding RNAs (lncRNAs) are commonly dysregulated in tumors, but only a handful are known to play pathophysiological roles in cancer. We inferred lncRNAs that dysregulate cancer pathways, oncogenes, and tumor suppressors (cancer genes) by modeling their effects on the activity of transcription factors, RNA-binding proteins, and microRNAs in 5,185 TCGA tumors and 1,019 ENCODE assays. Our predictions included hundreds of candidate onco- and tumor-suppressor lncRNAs (cancer lncRNAs) whose somatic alterations account for the dysregulation of dozens of cancer genes and pathways in each of 14 tumor contexts. To demonstrate proof of concept, we showed that perturbations targeting OIP5-AS1 (an inferred tumor suppressor) and TUG1 and WT1-AS (inferred onco-lncRNAs) dysregulated cancer genes and altered proliferation of breast and gynecologic cancer cells. Our analysis indicates that, although most lncRNAs are dysregulated in a tumor-specific manner, some, including OIP5-AS1, TUG1, NEAT1, MEG3, and TSIX, synergistically dysregulate cancer pathways in multiple tumor contexts

    Spatial Organization and Molecular Correlation of Tumor-Infiltrating Lymphocytes Using Deep Learning on Pathology Images

    Beyond sample curation and basic pathologic characterization, the digitized H&E-stained images of TCGA samples remain underutilized. To highlight this resource, we present mappings of tumor-infiltrating lymphocytes (TILs) based on H&E images from 13 TCGA tumor types. These TIL maps are derived through computational staining using a convolutional neural network trained to classify patches of images. Affinity propagation revealed local spatial structure in TIL patterns and correlation with overall survival. TIL map structural patterns were grouped using standard histopathological parameters. These patterns are enriched in particular T cell subpopulations derived from molecular measures. TIL densities and spatial structure were differentially enriched among tumor types, immune subtypes, and tumor molecular subtypes, implying that spatial infiltrate state could reflect particular tumor cell aberration states. Obtaining spatial lymphocytic patterns linked to the rich genomic characterization of TCGA samples demonstrates one use for the TCGA image archives with insights into the tumor-immune microenvironment

    Pan-cancer Alterations of the MYC Oncogene and Its Proximal Network across the Cancer Genome Atlas

    Although the MYC oncogene has been implicated in cancer, a systematic assessment of alterations of MYC, related transcription factors, and co-regulatory proteins, forming the proximal MYC network (PMN), across human cancers is lacking. Using computational approaches, we define genomic and proteomic features associated with MYC and the PMN across the 33 cancers of The Cancer Genome Atlas. Pan-cancer, 28% of all samples had at least one of the MYC paralogs amplified. In contrast, the MYC antagonists MGA and MNT were the most frequently mutated or deleted members, proposing a role as tumor suppressors. MYC alterations were mutually exclusive with PIK3CA, PTEN, APC, or BRAF alterations, suggesting that MYC is a distinct oncogenic driver. Expression analysis revealed MYC-associated pathways in tumor subtypes, such as immune response and growth factor signaling; chromatin, translation, and DNA replication/repair were conserved pan-cancer. This analysis reveals insights into MYC biology and is a reference for biomarkers and therapeutics for cancers with alterations of MYC or the PMN

    FOXA1 and adaptive response determinants to HER2 targeted therapy in TBCRC 036

    Inhibition of the HER2/ERBB2 receptor is a keystone to treating HER2-positive malignancies, particularly breast cancer, but a significant fraction of HER2-positive (HER2+) breast cancers recur or fail to respond. Anti-HER2 monoclonal antibodies, like trastuzumab or pertuzumab, and ATP active site inhibitors like lapatinib, commonly lack durability because of adaptive changes in the tumor leading to resistance. HER2+ cell line responses to inhibition with lapatinib were analyzed by RNAseq and ChIPseq to characterize transcriptional and epigenetic changes. Motif analysis of lapatinib-responsive genomic regions implicated the pioneer transcription factor FOXA1 as a mediator of adaptive responses. Lapatinib in combination with FOXA1 depletion led to dysregulation of enhancers, impaired adaptive upregulation of HER3, and decreased proliferation. HER2-directed therapy using clinically relevant drugs (trastuzumab with or without lapatinib or pertuzumab) in a 7-day clinical trial designed to examine early pharmacodynamic response to antibody-based anti-HER2 therapy showed reduced FOXA1 expression was coincident with decreased HER2 and HER3 levels, decreased proliferation gene signatures, and increased immune gene signatures. This highlights the importance of the immune response to anti-HER2 antibodies and suggests that inhibiting FOXA1-mediated adaptive responses in combination with HER2 targeting is a potential therapeutic strategy

    Integrative Genomic Analysis of Cholangiocarcinoma Identifies Distinct IDH-Mutant Molecular Profiles

    Cholangiocarcinoma (CCA) is an aggressive malignancy of the bile ducts, with poor prognosis and limited treatment options. Here, we describe the integrated analysis of somatic mutations, RNA expression, copy number, and DNA methylation by The Cancer Genome Atlas of a set of predominantly intrahepatic CCA cases and propose a molecular classification scheme. We identified an IDH mutant-enriched subtype with distinct molecular features including low expression of chromatin modifiers, elevated expression of mitochondrial genes, and increased mitochondrial DNA copy number. Leveraging the multi-platform data, we observed that ARID1A exhibited DNA hypermethylation and decreased expression in the IDH mutant subtype. More broadly, we found that IDH mutations are associated with an expanded histological spectrum of liver tumors with molecular features that stratify with CCA. Our studies reveal insights into the molecular pathogenesis and heterogeneity of cholangiocarcinoma and provide classification information of potential therapeutic significance

    Driver Fusions and Their Implications in the Development and Treatment of Human Cancers.

    Gene fusions represent an important class of somatic alterations in cancer. We systematically investigated fusions in 9,624 tumors across 33 cancer types using multiple fusion calling tools. We identified a total of 25,664 fusions, with a 63% validation rate. Integration of gene expression, copy number, and fusion annotation data revealed that fusions involving oncogenes tend to exhibit increased expression, whereas fusions involving tumor suppressors have the opposite effect. For fusions involving kinases, we found 1,275 with an intact kinase domain, the proportion of which varied significantly across cancer types. Our study suggests that fusions drive the development of 16.5% of cancer cases and function as the sole driver in more than 1% of them. Finally, we identified druggable fusions involving genes such as TMPRSS2, RET, FGFR3, ALK, and ESR1 in 6.0% of cases, and we predicted immunogenic peptides, suggesting that fusions may provide leads for targeted drug and immune therapy

    Comprehensive Molecular Characterization of Pheochromocytoma and Paraganglioma

    We report a comprehensive molecular characterization of pheochromocytomas and paragangliomas (PCCs/PGLs), a rare tumor type. Multi-platform integration revealed that PCCs/PGLs are driven by diverse alterations affecting multiple genes and pathways. Pathogenic germline mutations occurred in eight PCC/PGL susceptibility genes. We identified CSDE1 as a somatically mutated driver gene, complementing four known drivers (HRAS, RET, EPAS1, and NF1). We also discovered fusion genes in PCCs/PGLs, involving MAML3, BRAF, NGFR, and NF1. Integrated analysis classified PCCs/PGLs into four molecularly defined groups: a kinase signaling subtype, a pseudohypoxia subtype, a Wnt-altered subtype, driven by MAML3 and CSDE1, and a cortical admixture subtype. Correlates of metastatic PCCs/PGLs included the MAML3 fusion gene. This integrated molecular characterization provides a comprehensive foundation for developing PCC/PGL precision medicine